Improving Data Availability Using Combined Replication Strategy in Cloud Environment

Authors

  • M. M. Javidi Department of Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran.
  • N. Mansouri Department of Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran.
Abstract:

As data-intensive applications in cloud computing grow day by day, data popularity in this environment becomes increasingly important. To improve data availability and provide efficient access to popular data, replication algorithms are now widely used in distributed systems. However, most of them create only a static number of replicas on some chosen requesting sites, which is not enough for reasonable performance. In addition, request failure is one of the most common issues within data centers. To address these problems, we propose a new data replication strategy that provides cost-effective availability, minimizes the response time of applications, and balances load across cloud storage. The proposed replication strategy has three steps: identifying the data files to replicate, placing new replicas, and replacing replicas. In the first step, it finds the most requested files for replication. In the second step, it selects the best site for storing a new replica, so as to reduce access time, by considering five factors (the frequency of requests for the replica, the last time the replica was requested, failure probability, centrality factor, and storage usage). In the third step, the replacement decision is made in order to provide better resource usage. The proposed strategy determines the importance of valuable replicas based on the expected number of future accesses, the availability of the file, the last time the replica was requested, and the size of the replica. The proposed algorithm was evaluated with the CloudSim simulator, and the results confirmed the better performance of the hybrid replication strategy in terms of mean response time, effective network usage, replication frequency, degree of imbalance, and number of communications.
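The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract gives only the factors, not the formulas, so everything concrete here is an assumption made for illustration: the names `Site`, `Replica`, `select_popular_files`, `site_score`, and `replica_value`, the normalization of every factor to the range 0 to 1, the equal weights, the popularity threshold, and the sign with which each factor enters a score. It should not be read as the authors' actual method.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    request_freq: float   # how often the replica is requested here (0..1, assumed normalized)
    recency: float        # 1.0 = requested very recently (assumed normalized)
    failure_prob: float   # probability that a request to this site fails (0..1)
    centrality: float     # how central the site is in the network (0..1)
    storage_usage: float  # fraction of storage already used (0..1)


@dataclass
class Replica:
    name: str
    predicted_accesses: float  # expected future accesses (0..1, assumed normalized)
    availability: float        # availability of the file (0..1)
    recency: float             # 1.0 = requested very recently
    size: float                # replica size (0..1, assumed normalized)


def select_popular_files(access_counts, threshold):
    """Step 1 (assumed form): keep files requested at least `threshold` times."""
    return [name for name, count in access_counts.items() if count >= threshold]


def site_score(site, weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Step 2 (assumed form): weighted sum of the five placement factors.

    Assumption: failure probability and storage usage count against a site,
    so they enter as (1 - value). The equal weights are placeholders.
    """
    w_freq, w_rec, w_fail, w_cent, w_store = weights
    return (w_freq * site.request_freq
            + w_rec * site.recency
            + w_fail * (1.0 - site.failure_prob)
            + w_cent * site.centrality
            + w_store * (1.0 - site.storage_usage))


def replica_value(replica):
    """Step 3 (assumed form): value of keeping a replica.

    Assumption: future accesses, availability, and recency raise the value,
    while size lowers it. The source does not state these directions.
    """
    return (replica.predicted_accesses + replica.availability
            + replica.recency - replica.size)


# Hypothetical example data, not taken from the paper.
sites = [
    Site("A", request_freq=0.9, recency=0.8, failure_prob=0.1,
         centrality=0.7, storage_usage=0.3),
    Site("B", request_freq=0.2, recency=0.1, failure_prob=0.5,
         centrality=0.3, storage_usage=0.9),
]
replicas = [
    Replica("r1", predicted_accesses=0.9, availability=0.5, recency=0.8, size=0.2),
    Replica("r2", predicted_accesses=0.1, availability=0.5, recency=0.1, size=0.6),
]

popular = select_popular_files({"f1": 10, "f2": 2}, threshold=5)
target = max(sites, key=site_score)        # site chosen for the new replica
victim = min(replicas, key=replica_value)  # replica to evict when space is needed
print(popular, target.name, victim.name)
```

A weighted sum is used only because it is the simplest way to combine several normalized factors. The published strategy may combine them differently.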


Similar resources

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...


An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...


Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of files and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...


Improving Data Availability in Mobile Environment Using Data Allocation

Data distribution is one of the crucial issues in Data Base Management Systems (DBMS) in general and in mobile environments in particular. It is important because, if not properly managed, it reduces data availability, which in turn causes more rejected transactions. Replication algorithms (e.g., CCM) are used to improve data availability. However, the database replication a...


A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment

Large-scale data management is a critical problem in a distributed system such as cloud, P2P system, World Wide Web (WWW), and Data Grid. One of the effective solutions is data replication technique, which efficiently reduces the cost of communication and improves the data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...


A Light-weight Data Replication for Cloud Data Centers Environment

Unlike traditional high-performance computing environments such as supercomputers, cloud computing is a collection of interconnected and virtualized computing resources that are managed as one or more unified computing resources. The cloud environment is heterogeneous and highly dynamic. Failures on the data center storage nodes are normal rather than exceptiona...




Journal title

Volume 15, Issue 3

Pages 282-293

Publication date: 2019-09
